AWS Auto Scaling
Learn about Auto Scaling and scaling based on SQS.
Amazon EC2 Auto Scaling#
AWS Auto Scaling refers to a collection of Auto Scaling capabilities across several AWS services. It monitors your applications and automatically adjusts the capacity to maintain steady, predictable performance at the lowest possible cost.
The services within the AWS Auto Scaling family include:
- Amazon EC2 (known as Amazon EC2 Auto Scaling)
- Amazon ECS
- Amazon DynamoDB
- Amazon Aurora
Auto Scaling also works with ELB, CloudWatch, and CloudTrail.
General Auto Scaling concepts#
Amazon EC2 Auto Scaling helps you to ensure that you have the correct number of Amazon EC2 instances available to handle the load for your application. Availability, cost, and system metrics can all factor into scaling.
You create collections of EC2 instances, called Auto Scaling groups, and Auto Scaling automatically provides horizontal scaling (scale-out) for your instances. Scaling is triggered by an event or scaling action, which either launches or terminates instances.
Auto Scaling is a region-specific service; however, it can span multiple AZs within the same AWS Region. It will try to distribute EC2 instances evenly across AZs. Auto Scaling can be configured from the Console, CLI, SDKs, and APIs.
Billing
There is no additional cost for Auto Scaling; you just pay for the resources (EC2 instances) provisioned.
Launch configurations#
You can determine which subnets Auto Scaling will launch new instances into. A launch configuration is the template used to create new EC2 instances and includes parameters such as instance family, instance type, AMI, key pair, and security groups. You cannot edit a launch configuration once defined.
- A launch configuration can be created from the AWS console or CLI.
- You can create a new launch configuration, or you can use an existing running EC2 instance to create one.
  - The AMI must exist on EC2.
  - EC2 instance tags and any additional block store volumes created after the instance launch will not be taken into account.
- If you want to change a launch configuration, you have to create a new one, make the required changes, and use that with your Auto Scaling groups.
You can use a launch configuration with multiple Auto Scaling groups (ASG).
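Because launch configurations are immutable, a "change" always means creating a new configuration and repointing the ASG at it. The sketch below illustrates this with boto3-style parameters; all names and IDs (`web-lc-v1`, the AMI ID, key pair, and security group) are hypothetical placeholders, and the actual API call is shown but not executed.

```python
# Hypothetical parameters for a launch configuration.
# All resource names/IDs below are placeholders, not real resources.
launch_config_params = {
    "LaunchConfigurationName": "web-lc-v1",
    "ImageId": "ami-0123456789abcdef0",  # the AMI must exist on EC2
    "InstanceType": "t3.micro",
    "KeyName": "my-key-pair",
    "SecurityGroups": ["web-sg"],
}

# With boto3, the call would be roughly (not executed here):
# import boto3
# boto3.client("autoscaling").create_launch_configuration(**launch_config_params)

# Launch configurations cannot be edited, so changing the instance type
# means creating a new configuration and updating the ASG to use it:
new_params = dict(
    launch_config_params,
    LaunchConfigurationName="web-lc-v2",
    InstanceType="t3.small",
)
```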
Auto Scaling groups#
An ASG is a logical grouping of EC2 instances managed by an Auto Scaling Policy. An ASG can be edited once defined.
You can attach one or more target groups to your ASG to include instances behind an ALB. You can also attach one or more classic ELBs to your existing ASG; however, the ELBs must be in the same region. Once you do this, any existing EC2 instances, and any instances added by the ASG, will be automatically registered with the load balancers defined on the ASG. If adding an instance to an ASG would result in exceeding the maximum capacity of the ASG, the request will fail.
You can add a running instance to an ASG if the following conditions are met:
- The instance is in a running state.
- The AMI used to launch the instance still exists.
- The instance is not part of another ASG.
- The instance is in one of the AZs defined for the ASG.
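The four conditions above can be sketched as a simple precondition check. This is a hypothetical illustration of the rules, not AWS's implementation; the instance and ASG dictionaries and the `can_attach` helper are invented for this example.

```python
def can_attach(instance, asg):
    """Check the documented preconditions for attaching a running
    EC2 instance to an Auto Scaling group (simplified sketch)."""
    return (
        instance["state"] == "running"          # instance must be running
        and instance["ami_exists"]              # its AMI must still exist
        and instance["asg"] is None             # not part of another ASG
        and instance["az"] in asg["azs"]        # in one of the ASG's AZs
    )

instance = {"state": "running", "ami_exists": True, "asg": None, "az": "us-east-1a"}
asg = {"azs": ["us-east-1a", "us-east-1b"]}
print(can_attach(instance, asg))  # True
```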
Scaling#
The scaling options define the triggers and when instances should be provisioned/de-provisioned.
There are four scaling options:
- Maintain: Keep a specific or minimum number of instances running.
- Manual: Manually set the minimum, maximum, or desired number of instances.
- Scheduled: Increase or decrease the number of instances based on a schedule.
- Dynamic: Scale based on real-time system metrics (e.g., CloudWatch metrics).
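As a concrete illustration of the scheduled option, the parameters for a scheduled action might look like the sketch below. The group name, action name, recurrence, and capacities are hypothetical values; the boto3 call is shown for context but not executed.

```python
# Hypothetical scheduled action: raise capacity every weekday morning
# ahead of a known busy period. All values are placeholders.
scheduled_action = {
    "AutoScalingGroupName": "web-asg",
    "ScheduledActionName": "weekday-morning-scale-up",
    "Recurrence": "0 8 * * 1-5",  # cron syntax: 08:00 UTC, Mon-Fri
    "MinSize": 4,
    "MaxSize": 10,
    "DesiredCapacity": 6,
}

# With boto3, the call would be roughly (not executed here):
# import boto3
# boto3.client("autoscaling").put_scheduled_update_group_action(**scheduled_action)
```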
The following table summarizes scaling options:
| Scaling Type | What it is | When to use |
|---|---|---|
| Maintain | Ensures that the required number of instances are running | Use when you always need a known number of instances running at all times |
| Manual | Manually change desired capacity via the console or CLI | Use when your needs change rarely enough that you’re OK with making manual changes |
| Scheduled | Adjust min/max instances on specific dates/times or recurring time periods | Use when you know your busy and quiet times. Useful for ensuring enough instances are available before very busy times |
| Dynamic | Scale in response to system load or other triggers using metrics | Useful for changing capacity based on system utilization, e.g., CPU hits 80% |
The scaling options are configured through scaling policies, which determine if, when, and how the ASG scales out and in.
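As an example of a dynamic scaling policy, a target tracking policy that keeps aggregate CPU near 70% could be expressed with parameters like the sketch below. The group and policy names are hypothetical; the boto3 call is shown but not executed.

```python
# Hypothetical target tracking policy: keep average CPU around 70%.
target_tracking_policy = {
    "AutoScalingGroupName": "web-asg",          # placeholder group name
    "PolicyName": "cpu-70-target",              # placeholder policy name
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization",
        },
        "TargetValue": 70.0,  # add/remove capacity to stay near this value
    },
}

# With boto3, the call would be roughly (not executed here):
# import boto3
# boto3.client("autoscaling").put_scaling_policy(**target_tracking_policy)
```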
The following table describes the scaling policy types available for dynamic scaling policies and when to use them:
| Scaling policy type | What it is | When to use |
|---|---|---|
| Target Tracking Policy | This scaling policy adds or removes capacity as required to keep the metric at, or close to, the specified target value | A use case is that you want to keep the aggregate CPU usage of your ASG at 70% |
| Simple Scaling Policy | Waits until health check and cool-down period expires before re-evaluating | This is a more conservative way to add/remove instances. Useful when the load is erratic. AWS recommends step scaling instead of simple in most cases |
| Step Scaling Policy | Increase or decrease the current capacity of your Auto Scaling group based on a set of scaling adjustments (known as step adjustments) | Useful when you want to vary adjustments based on the size of the alarm breach |
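Step scaling's idea of varying the adjustment with the size of the alarm breach can be sketched as follows. This is a simplified illustration of the concept, not AWS's implementation; the `step_adjustment` helper and the thresholds are invented for this example.

```python
def step_adjustment(current_capacity, metric, threshold, steps):
    """Return the new capacity using step adjustments. Each step is
    (lower_bound, upper_bound, change) relative to the alarm threshold;
    an upper_bound of None means unbounded. Simplified sketch."""
    breach = metric - threshold
    for lower, upper, change in steps:
        if breach >= lower and (upper is None or breach < upper):
            return current_capacity + change
    return current_capacity  # alarm not breached: capacity unchanged

# Alarm threshold of 60% CPU; larger breaches add more instances.
steps = [(0, 10, 1), (10, 20, 2), (20, None, 3)]
print(step_adjustment(4, 65, 60, steps))  # breach of 5  -> add 1 -> 5
print(step_adjustment(4, 85, 60, steps))  # breach of 25 -> add 3 -> 7
```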
As an example, consider an Auto Scaling group with a scaling policy set to a minimum size of 1 instance, a desired capacity of 2 instances, and a maximum size of 4 instances.
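Whatever the scaling policy requests, Auto Scaling never moves the group outside its minimum/maximum bounds. That behaviour amounts to a simple clamp, sketched below (a hypothetical illustration, not AWS code):

```python
def clamp_capacity(desired, min_size, max_size):
    """Auto Scaling keeps the group within its min/max bounds,
    whatever capacity a scaling policy requests."""
    return max(min_size, min(desired, max_size))

# Group with min 1, desired 2, max 4:
print(clamp_capacity(2, 1, 4))  # 2 (within bounds)
print(clamp_capacity(6, 1, 4))  # 4 (capped at the maximum)
print(clamp_capacity(0, 1, 4))  # 1 (raised to the minimum)
```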
Scaling based on Amazon SQS#
We can also scale based on an Amazon Simple Queue Service (SQS) queue.
📝 Exam Tip 📝
This comes up as an exam question for SAA-C02.
It uses a custom metric that is sent to Amazon CloudWatch, which measures the number of messages in the queue per EC2 instance in the Auto Scaling group. You can then use a target tracking policy that configures your Auto Scaling group to scale based on the custom metric and a set target value. CloudWatch alarms invoke the scaling policy.
You can also use a custom "backlog per instance" metric, which tracks not just the number of messages in the queue, but the number available for retrieval per instance. This can be based on the SQS metric ApproximateNumberOfMessages.
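The backlog-per-instance calculation can be sketched as below. The helper names and the target of 100 messages per instance are assumptions for illustration; in practice the queue depth would come from the SQS ApproximateNumberOfMessages metric and be published to CloudWatch as a custom metric.

```python
import math

def backlog_per_instance(queue_depth, running_instances):
    """Custom metric: messages in the queue per running instance.
    queue_depth would come from SQS ApproximateNumberOfMessages."""
    return queue_depth / running_instances

def desired_capacity(queue_depth, target_backlog_per_instance):
    """Capacity needed so backlog per instance meets the target;
    round up so the target is not exceeded."""
    return math.ceil(queue_depth / target_backlog_per_instance)

# 1,000 messages in the queue, 4 instances running, and a target of
# 100 messages per instance:
print(backlog_per_instance(1000, 4))  # 250.0 -> above target, scale out
print(desired_capacity(1000, 100))    # 10
```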